Ceph as a scalable alternative to the Hadoop Distributed File System

Author

Carlos Maltzahn

Abstract

The Hadoop Distributed File System (HDFS) has a single metadata server that sets a hard limit on its maximum size. Ceph, a high-performance distributed file system under development since 2005 and now supported in Linux, bypasses the scaling limits of HDFS. We describe Ceph and its elements and provide instructions for installing a demonstration system that can be used with Hadoop.

Similar resources

Personalized Cloud Storage System: A Combination of LDAP Distributed File System

As cloud computing flourishes, distributed storage systems such as Gluster, Ceph, Lustre, and Hadoop are becoming increasingly diverse. In this paper, we propose a personal cloud storage system integrated with pNFS that can be accessed in parallel for scalable performance. In addition, data backup and failover mechanisms are designed. We expect that the function of the pro...

Impact of Single Parameter Changes on Ceph Cloud Storage Performance

In a general purpose cloud system efficiencies are yet to be had from supporting diverse applications and their requirements within a storage system used for a private cloud. Supporting such diverse requirements poses a significant challenge in a storage system that supports fine grained configuration on a variety of parameters. This paper uses the Ceph distributed file system, and in particula...

A Scalable RDF Data Processing Framework based on Pig and Hadoop

In order to effectively handle the growing amount of available RDF data, scalable and flexible RDF data processing frameworks are needed. While emerging technologies for Big Data, such as Hadoop-based systems that take advantage of scalable and fault-tolerant distributed processing, based on Google’s distributed file system and MapReduce parallel model, have become available, there are still m...

Comparing Hadoop and Fat-Btree Based Access Method for Small File I/O Applications

Hadoop has been widely used in various clusters to build scalable and high-performance distributed file systems. However, the Hadoop Distributed File System (HDFS) is designed for large file management. In small-file applications, metadata requests flood the network and consume most of the memory in the Namenode, which sharply hinders its performance. Therefore, many web applications ...

Of Ivory and Smurfs: Loxodontan MapReduce Experiments for Web Search

This paper describes Ivory, an attempt to build a distributed retrieval system around the open-source Hadoop implementation of MapReduce. We focus on three noteworthy aspects of our work: a retrieval architecture built directly on the Hadoop Distributed File System (HDFS), a scalable MapReduce algorithm for inverted indexing, and webpage classification to enhance retrieval effectiveness.

Publication date: 2010
